Hallucination (AI)

The bullshit that generative AI produces is sometimes called “hallucination,” but this is a misuse of the term. See ChatGPT is bullshit.

Classifications

Closed-domain vs. open-domain:1

  • Closed-domain hallucinations refer to instances in which the model is instructed to use only information provided in a given context, but then makes up extra information that was not in that context. For example, if you ask the model to summarize an article and its summary includes information that was not in the article, then that would be a closed-domain hallucination.
  • Open-domain hallucinations, in contrast, are when the model confidently provides false information about the world without reference to any particular input context.
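
To make the closed-domain case concrete, here is a minimal Python sketch of a crude detector: it flags summary sentences whose content words barely overlap the provided article, treating them as candidate closed-domain hallucinations. The sentence splitter, the length-4 content-word proxy, the 0.5 overlap threshold, and the example texts are all illustrative assumptions, not anything from the cited report; a real pipeline would use an entailment model rather than lexical overlap.

```python
import re


def sentences(text: str) -> list[str]:
    """Rough sentence splitter on ./!/? boundaries (illustrative only)."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def content_words(text: str) -> set[str]:
    """Lowercased alphabetic tokens of length >= 4, a crude proxy for content words."""
    return {w for w in re.findall(r"[a-z]+", text.lower()) if len(w) >= 4}


def flag_unsupported(summary: str, article: str, min_overlap: float = 0.5) -> list[str]:
    """Flag summary sentences whose content words barely appear in the article.

    Such sentences introduce information that was not in the given context,
    i.e. candidate closed-domain hallucinations.
    """
    article_vocab = content_words(article)
    flagged = []
    for sent in sentences(summary):
        vocab = content_words(sent)
        if vocab and len(vocab & article_vocab) / len(vocab) < min_overlap:
            flagged.append(sent)
    return flagged


if __name__ == "__main__":
    article = "The city council approved the new transit budget on Tuesday."
    summary = ("The city council approved the new transit budget on Tuesday. "
               "The mayor also announced a tax cut for small businesses.")
    print(flag_unsupported(summary, article))  # flags the made-up second sentence
```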

Intrinsic vs. extrinsic:2

  • Intrinsic hallucinations: generated output that contradicts the source content. For instance, in abstractive summarization, the generated summary “The first Ebola vaccine was approved in 2021” contradicts the source content “The first vaccine for Ebola was approved by the FDA in 2019.”
  • Extrinsic hallucinations: generated output that cannot be verified from the source content (i.e., output that can neither be supported nor contradicted by the source). For example, in abstractive summarization, the generated statement “China has already started clinical trials of the COVID-19 vaccine.” is not mentioned in the source; we can neither find evidence for it in the source nor assert that it is wrong. Notably, extrinsic hallucination is not always erroneous, because it may draw on factually correct external information. Such factual hallucination can even be helpful, since it recalls additional background knowledge that improves the informativeness of the generated text. Even so, most of the literature still treats extrinsic hallucination with caution, because the unverifiable nature of this additional information increases the risk from a factual-safety perspective.
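
One way to read the intrinsic/extrinsic split is through natural language inference over (source, generated claim) pairs: a contradiction corresponds to intrinsic hallucination, a neutral (unverifiable) relation to extrinsic hallucination, and entailment to faithful output. The sketch below assumes some external NLI model that returns one of those three labels; the `Judgment` enum and the `judge` function are hypothetical names, not from the cited survey.

```python
from enum import Enum


class Judgment(Enum):
    FAITHFUL = "faithful"    # claim is entailed (supported) by the source
    INTRINSIC = "intrinsic"  # claim contradicts the source
    EXTRINSIC = "extrinsic"  # claim is neither supported nor contradicted


def judge(nli_label: str) -> Judgment:
    """Map an NLI label for (premise=source, hypothesis=claim) onto the taxonomy.

    `nli_label` is assumed to come from some NLI model that returns
    "entailment", "contradiction", or "neutral"; the model itself is not shown.
    """
    mapping = {
        "entailment": Judgment.FAITHFUL,
        "contradiction": Judgment.INTRINSIC,  # intrinsic hallucination
        "neutral": Judgment.EXTRINSIC,        # extrinsic (unverifiable) content
    }
    return mapping[nli_label]


# Example: "The first Ebola vaccine was approved in 2021" checked against the
# 2019 source would come back as "contradiction" -> Judgment.INTRINSIC.
```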

Input, context, fact:3

  • Input-conflicting hallucination, where LLMs generate content that deviates from the source input provided by users;
  • Context-conflicting hallucination, where LLMs generate content that conflicts with previously generated information by itself;
  • Fact-conflicting hallucination, where LLMs generate content that is not faithful to established world knowledge.
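
The three categories differ only in what evidence the generated claim is checked against: the user's input, the model's own earlier output, or world knowledge. A minimal routing sketch, assuming hypothetical `contradicts` and `conflicts_with_world_knowledge` predicates (e.g. an NLI model and a fact checker), might look like this:

```python
from enum import Enum, auto
from typing import Callable, Iterable, Optional


class Hallucination(Enum):
    INPUT_CONFLICTING = auto()    # conflicts with the source input provided by the user
    CONTEXT_CONFLICTING = auto()  # conflicts with the model's own earlier output
    FACT_CONFLICTING = auto()     # conflicts with established world knowledge


def classify(
    claim: str,
    user_input: str,
    prior_outputs: Iterable[str],
    contradicts: Callable[[str, str], bool],
    conflicts_with_world_knowledge: Callable[[str], bool],
) -> Optional[Hallucination]:
    """Route a generated claim to one of the three categories (or None if clean).

    `contradicts(evidence, claim)` and `conflicts_with_world_knowledge(claim)`
    are placeholders for whatever checker is actually used.
    """
    if contradicts(user_input, claim):
        return Hallucination.INPUT_CONFLICTING
    if any(contradicts(prev, claim) for prev in prior_outputs):
        return Hallucination.CONTEXT_CONFLICTING
    if conflicts_with_world_knowledge(claim):
        return Hallucination.FACT_CONFLICTING
    return None
```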

Footnotes

  1. GPT-4 technical report

  2. Survey of hallucination in natural language generation

  3. Siren’s song in the AI ocean: A survey on hallucination in large language models
